Abstract

We analyze a fairly standard idealization of Pollard's Rho algorithm for finding the discrete
logarithm in a cyclic group $G$. It is found that, with high probability, a collision occurs
in $O(\sqrt{|G| \log|G| \log\log|G|})$ steps, not far from the widely conjectured value of $\Theta(\sqrt{|G|})$.
This improves upon a recent result of Miller-Venkatesan which showed an upper bound of
$O(\sqrt{|G|}\,\log^3 |G|)$. Our proof is based on analyzing an appropriate nonreversible, non-lazy
random walk on a discrete cycle of (odd) length $|G|$, and showing that the mixing time of the
corresponding walk is $O(\log|G| \log\log|G|)$.

1 Introduction 

The classical discrete logarithm problem on a cyclic group deals with computing the exponents,
given the generator of the group; more precisely, given a generator $x$ of a cyclic group $G$ and
an element $y = x^k$, one would like to compute $k$ efficiently. Due to its presumed computational
difficulty, the problem figures prominently in various cryptosystems, including the Diffie-Hellman
key exchange, the El Gamal system, and elliptic curve cryptosystems. About 30 years ago, J.M. Pollard
suggested algorithms to help solve both factoring large integers [11] and the discrete logarithm
problem [12]. While the algorithms are of much interest in computational number theory and
cryptography, there has been very little work on rigorous analyses. We refer the reader to [9] and
other existing literature (e.g., [18, 3]) for further cryptographic and number-theoretical motivation
for the discrete logarithm problem.

Pollard's Rho algorithm for finding discrete logarithms is based on a pseudo-random
approximation to a Markov chain on a cyclic group $G$. While there has been no rigorous proof of rapid
mixing of the corresponding Markov chain of order $O(\log^{c} |G|)$ until recently, a proof of mixing
of order $O(\log^3 |G|)$ steps by a non-trivial argument involving characters and quadratic forms was
provided by Miller-Venkatesan [9]. In addition, they proved that with high probability the collision
time is bounded by $O(\sqrt{|G|}\,\log^3 |G|)$ in the Pollard Rho algorithm. In this paper we improve on
this and prove the correct order mixing time bound of a closely related walk, along with a nearly
optimal bound on the mixing time of the Pollard Rho algorithm.

Our first approach will be an elementary proof based on canonical paths which shows the same
$O(\log^3 |G|)$ mixing time as [9]. In fact, related methods including log-Sobolev and Spectral profile
can show no better than $O(\log^2 |G|)$ mixing, still far from the correct bound. As such we then
turn to a different method and next show that arguments used to study a related walk by Aldous
and Diaconis [1] and Chung, Diaconis and Graham [2] can be modified to apply to this problem.
In particular, a strong stationary time is given to show $O(\log|G| \log\log|G|)$ mixing time when
$|G| = 2^m - 1$ for some $m$, while a Fourier analysis approach can show the same bound for general
odd order $|G|$. We then combine this with an improved argument on collision time of a walk, in
showing that $O(\sqrt{|G| \log|G| \log\log|G|})$ steps suffice until a collision occurs and the discrete logarithm
is possibly found, not far from the widely conjectured value of $\Theta(\sqrt{|G|})$. Finally we observe that
our approach is robust enough to allow analysis of other variants of the Pollard Rho algorithm, such
as those mentioned in the survey article by Teske [18]; we will include the necessary details in the
journal version of this manuscript. A noteworthy remark here is that the walks analyzed in [1, 2]
and other similar walks studied by Hildebrand [5] always double the current position (and then
add or subtract one with some probability); the subtlety in our problem arises from the original
(unaltered) Pollard Rho algorithm demanding that we double only with 1/3 probability. It turns
out that this requires a more careful analysis, since standard comparison-type arguments result in
additional $\log p$ factors in the mixing time.

In terms of prior (and not-so-recent) history, we remark that Shoup [14] had shown that any
generic algorithm which solves (with high probability) the discrete logarithm problem on integers
modulo p, must perform at least $\Omega(\sqrt{p})$ group operations, where p is the largest prime dividing n.
The notion of generic includes, among others, Pollard's Rho method and the Pohlig-Hellman algorithm
(see [14] for details). Pollard has shown that if the iterating function $F$ gives perfectly random
samples then the expected time until a collision occurs is in fact $\Theta(\sqrt{p})$, but it is not known
whether the form of iterating function proposed by Pollard gives a sufficient level of randomness,
and secondly one would like to estimate such a collision time with high probability. For one of the
variants of the Pollard Rho algorithm, see [6], wherein the authors replace the squaring step by a
walk on a Cayley graph of the group, and obtain bounds of the form $O(\sqrt{p})$, up to factors of $\log p$.

The paper proceeds as follows. In Section 2 the Pollard Rho algorithm is introduced, and a
relation is shown between collision time and mixing time in separation distance. We then use the
canonical path method to bound mixing time of this walk in Section 3. This is followed in Section
4 by a proof of a near optimal mixing bound in terms of strong stationary times when $|G| = 2^m - 1$.
In Section 5 a Fourier approach is used to show the same bound for the general case. Finally, in
conclusion, we discuss limitations of various commonly used methods for this problem.

2 Collision time of Pollard's Rho algorithm 

While the majority of analysis in this paper is devoted to studying the precise mixing time of a 
certain nonreversible, non-lazy random walk on a cycle of odd length, we first reduce the collision 
time problem (of Pollard's Rho discrete logarithm algorithm) to such a mixing time question. 
While such a reduction was already described in [9], our proposition below improves on their idea 
in yielding a smaller factor (see below for further clarification). First, let us introduce the algorithm. 

Consider a cyclic group $G$ of prime order $p = |G| \ne 2$, and suppose $x$ is a generator, that is
$G = \{x^i\}_{i=0}^{p-1}$. Given $y \in G$, the discrete logarithm problem asks us to find $k$ such that $x^k = y$.
Pollard suggested an algorithm on $\mathbb{Z}_p$ based on a random walk and the Birthday Paradox. A
common extension of his idea to groups of prime order is to start with a partition of $G$ into sets
$S_1, S_2, S_3$ of roughly equal sizes, and define an iterating function $F : G \to G$ by $F(g) = gx$ if
$g \in S_1$, $F(g) = g^2$ if $g \in S_2$ and $F(g) = gy = gx^k$ if $g \in S_3$. Then consider the walk $g_{i+1} = F(g_i)$.
If this walk passes through the same state twice, say $x^{a+kb} = x^{\alpha+k\beta}$, then $x^{a-\alpha} = x^{k(\beta-b)}$ and
so $a - \alpha \equiv k(\beta - b) \bmod p$ and $k \equiv (a - \alpha)(\beta - b)^{-1} \bmod p$, which determines $k$ unless $\beta \equiv b
\bmod p$. Hence, if we define a collision to be the event that the walk passes over the same group
element twice, then the first time there is a collision it might be possible to determine the discrete
logarithm.
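The walk and the collision rule just described can be sketched in a few lines. This is only an illustration under stated assumptions, not the space-efficient algorithm: the group is taken to be the order-$p$ subgroup of the integers modulo a prime `q`, the partition is drawn lazily at random per group element, the starting exponents are random, and every visited state is stored so that the first collision is detected directly. The function name and its parameters are hypothetical.

```python
import random

def pollard_rho_dlog(x, y, q, p, seed=0):
    """Find k with x^k = y (mod q), where x has prime order p modulo q.

    The walk tracks g = x^a * y^b and stops at the first repeated group
    element. Returns None if the collision does not determine k.
    """
    rng = random.Random(seed)
    part = {}  # assumption: random partition, drawn lazily per group element
    a, b = rng.randrange(p), rng.randrange(p)  # assumption: random start
    g = pow(x, a, q) * pow(y, b, q) % q
    seen = {}  # group element -> exponents (alpha, beta) at first visit
    while g not in seen:
        seen[g] = (a, b)
        s = part.setdefault(g, rng.randrange(3))
        if s == 0:      # g -> g * x
            g, a = g * x % q, (a + 1) % p
        elif s == 1:    # g -> g^2
            g, a, b = g * g % q, 2 * a % p, 2 * b % p
        else:           # g -> g * y
            g, b = g * y % q, (b + 1) % p
    alpha, beta = seen[g]
    if (beta - b) % p == 0:
        return None     # degenerate collision: beta = b mod p
    # a + k*b = alpha + k*beta (mod p)  =>  k = (a - alpha) / (beta - b)
    return (a - alpha) * pow(beta - b, -1, p) % p
```

For example, with `q = 1019`, `p = 509` and `x = 4` (an element of order 509 modulo 1019), calling `pollard_rho_dlog(4, pow(4, 123, 1019), 1019, 509)` either returns 123 or, in the rare degenerate case, `None`.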

To estimate the running time until a collision one heuristic is to treat $F$ as if it outputs uniformly
random group elements. By the Birthday Paradox if $O(\sqrt{|G|})$ group elements are chosen uniformly
at random, then there is a high probability that two of these are the same. However, Teske [16] has
given experimental evidence that the time until a collision is slower than what would be expected
by a truly random process. We analyze instead the weaker idealization in which it is assumed only
that each $g \in G$ is assigned independently and at random to a partition $S_1$, $S_2$ or $S_3$. In this
case, although the iterating function $F$ described earlier is deterministic, because the partition of
$G$ was randomly chosen the walk is equivalent to a Markov chain (i.e. a random walk), at
least until the walk visits a previously visited state and a collision occurs. The problem is then
one of considering a walk on the exponent of $x$, that is a walk $R$ on the cycle $\mathbb{Z}_p$ with transitions
$R(i, i + 1) = R(i, i + k) = R(i, 2i) = 1/3$.

Recall that the event of revisiting an already visited state is called a collision. Our analysis of the
time until a collision occurs will be done by examining the rate of convergence of the Markov chain
to its stationary distribution $\pi$. The separation distance between a distribution $\sigma$ and stationary
distribution $\pi$ is $\mathrm{sep}(\sigma, \pi) = \max_{y \in V} \left(1 - \frac{\sigma(y)}{\pi(y)}\right)$. The mixing time of a Markov chain $P$ with state
space $V$ is

\[ \tau_s(\epsilon) = \min\left\{ n : \forall x, y \in V,\ 1 - \frac{P^n(x,y)}{\pi(y)} \le \epsilon \right\} , \]

which is the worst-case number of steps required for the separation distance to drop to $\epsilon$. The
following result relates $\tau_s(1/2)$ to the time until a collision occurs for any Markov chain $P$ with
uniform distribution on $G$ as the stationary distribution.

Proposition 2.1. With the above definitions, after

\[ 1 + \tau_s(1/2) + 2\sqrt{2c\,|G|\,\tau_s(1/2)} \]

steps, a collision occurs with probability at least $1 - e^{-c}$, for any $c > 0$.

Proof. Let $S$ denote the first $\lceil \sqrt{2c\,|G|\,\tau_s(1/2)} \rceil$ states visited by the walk. If two of these states
are the same then a collision has occurred, so assume all states are distinct. Even if we only check
for collisions every $\tau_s(1/2)$ steps, the chance that no collision occurs in the next $t\,\tau_s(1/2)$ steps (so
consider $t$ semi-random states) is then at most

\[ \left(1 - \frac{1}{2}\,\frac{|S|}{|G|}\right)^{t} \le \left(1 - \sqrt{\frac{c\,\tau_s(1/2)}{2|G|}}\right)^{t} \le \exp\left(-t\,\sqrt{\frac{c\,\tau_s(1/2)}{2|G|}}\right) . \]

When $t = \left\lceil \sqrt{\frac{2c|G|}{\tau_s(1/2)}} \right\rceil$ this is at most $e^{-c}$, as desired, and so at most

\[ \left\lceil \sqrt{2c\,|G|\,\tau_s(1/2)} \right\rceil + \left\lceil \sqrt{\frac{2c|G|}{\tau_s(1/2)}} \right\rceil \tau_s(1/2) \]

steps are required for a collision to occur with probability at least $1 - e^{-c}$.

□
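To get a feel for the size of the bound in Proposition 2.1, the step count can be evaluated numerically. The helper below is a hypothetical illustration (its name and argument names are not from the paper); it simply evaluates the expression in the proposition together with the failure probability $e^{-c}$.

```python
import math

def collision_step_bound(group_order, tau, c):
    """Evaluate the bound of Proposition 2.1.

    group_order -- |G|
    tau         -- the separation mixing time tau_s(1/2)
    c           -- any positive constant
    Returns (number of steps, upper bound e^{-c} on the failure probability).
    """
    steps = 1 + tau + 2 * math.sqrt(2 * c * group_order * tau)
    return steps, math.exp(-c)
```

With $|G| = 8$, $\tau_s(1/2) = 1$ and $c = 1$ the step count is $1 + 1 + 2\sqrt{16} = 10$. The dominant term grows like $\sqrt{|G|\,\tau_s(1/2)}$, which is why a sharper mixing time bound translates directly into a sharper collision time bound.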
Remark 2.2. By assuming each $g \in G$ is assigned independently and at random to a partition we
have eliminated one of the key features of the Pollard Rho algorithm, space efficiency. However, if
the partitions are given by a hash function $f : (G, p) \to \{1, 2, 3\}$ which is sufficiently pseudo-random
then we might expect behavior similar to the model with random partitions.

Remark 2.3. The analysis can easily be extended to the case when there are many partitions, of
varying sizes, each with their own transition rule, as long as a partition occupying at least a constant
fraction of the space corresponds to $g \to gx$ and another such partition corresponds to $g \to g^2$.

Throughout the analysis in the following sections, we assume that the size $p$ of the cycle $\mathbb{Z}_p$ (on
which the random walk is performed) is odd. Indeed there is a standard reduction (see [13] for
a very readable account and also a classical reference [10]) justifying the fact that it suffices to
study the discrete logarithm problem on cyclic groups of prime order.
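For very small odd $p$ the separation mixing time $\tau_s(\epsilon)$ of the walk $R$ defined in this section can be computed exactly by brute force, which is a convenient sanity check on the definitions. The sketch below assumes the idealized transition probabilities of $1/3$ for each of $i \to i+1$, $i \to i+k$ and $i \to 2i$ and uses the fact that the uniform distribution is stationary; the function names are hypothetical.

```python
def rho_step(dist, p, k):
    """Apply one step of the idealized Rho walk on Z_p to a distribution."""
    new = [0.0] * p
    for i, mass in enumerate(dist):
        if mass:
            for j in ((i + 1) % p, (i + k) % p, (2 * i) % p):
                new[j] += mass / 3.0
    return new

def separation_mixing_time(p, k, eps=0.5, max_steps=10_000):
    """Smallest n with 1 - P^n(x, y) / pi(y) <= eps for all x, y (pi uniform)."""
    # One point-mass distribution per starting state x.
    dists = [[1.0 if j == x else 0.0 for j in range(p)] for x in range(p)]
    for n in range(1, max_steps + 1):
        dists = [rho_step(d, p, k) for d in dists]
        # pi(y) = 1/p, so the separation from x is 1 - p * min_y P^n(x, y).
        worst = max(1.0 - p * min(d) for d in dists)
        if worst <= eps:
            return n
    raise RuntimeError("did not mix within max_steps")
```

The cost is of order $p^2$ per step, so this is only practical for small cycles such as `separation_mixing_time(31, 7)`.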



3 Canonical Paths 



Perhaps the most widely used approach to bounding mixing times is the method of canonical paths. 
However, this method has been used primarily for walks which are either lazy or reversible, and 
usually both. The Pollard Rho walk is neither, but as we will now see, it is still possible to apply 
the canonical path method. 

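Canonical path methods rely on studying the spectral gap:

Definition 3.1. Given Markov chain $P$ on state space $V$ the spectral gap $\lambda = \lambda_P$ is defined by

\[ \lambda_P = \inf_{\substack{f : V \to \mathbb{R}, \\ \mathrm{Var}_\pi(f) \ne 0}} \frac{\mathcal{E}_P(f,f)}{\mathrm{Var}_\pi(f)} \]

with $\mathrm{Var}_\pi(f) = E_\pi f^2 - (E_\pi f)^2$ and Dirichlet form

\[ \mathcal{E}_P(f,f) = \frac{1}{2} \sum_{x,y \in V} (f(x) - f(y))^2\, \pi(x) P(x,y) . \]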

Fill [3], building on work of Mihail [8], showed a bound on the mixing time.
Theorem 3.2. The mixing time of a finite Markov chain P on state space V is at worst 



\[ \tau_s(\epsilon) \le \frac{1}{\lambda_{PP^*}} \log \frac{1}{\epsilon\,\pi_0} \]



where $\pi_0 = \min_{x \in V} \pi(x)$ and the time reversal $P^*$ is given by $P^*(x,y) = \frac{\pi(y) P(y,x)}{\pi(x)}$.
One of the more common ways of bounding the spectral gap is via canonical paths [15].



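Theorem 3.3. Consider a finite Markov chain $P$ on state space $V$. For every $x, y \in V$, $x \ne y$,
define a path $\gamma_{xy}$ from $x$ to $y$ along edges of $P$ (i.e. $\gamma_{xy} \subset E = \{(a, b) \in V \times V : P(a,b) > 0\}$).
Then

\[ \lambda \ge \left( \max_{(a,b) \in E} \frac{1}{\pi(a) P(a,b)} \sum_{\substack{(x,y) :\, x \ne y, \\ (a,b) \in \gamma_{xy}}} \pi(x)\pi(y)\,|\gamma_{xy}| \right)^{-1} . \]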
It suffices to bound the mixing time of the walk $R^2$, because the mixing time of $R$ is at most
twice this.

Lemma 3.4. Let $K(i, 2i) = K(i, 2i - 1) = 1/2$ be a walk on the odd cycle $\mathbb{Z}_p$. The Pollard Rho walk
$R$ satisfies

\[ \lambda_{R^2 (R^2)^*} \ge \frac{2}{81}\, \lambda_K . \]

Proof. Observe that, from the definition of spectral gap,

\[ \forall i \ne j :\ R^2 (R^2)^*(i, j) \ge c\, K(i, j) \ \Longrightarrow\ \lambda_{R^2 (R^2)^*} \ge c\, \lambda_K . \]

Now, $K(i,j) \ne 0$ only if $j = 2i - 1$ or $j = 2i$, so it suffices to consider these transitions:

\[ R^2 (R^2)^*(i, 2i-1) \ge R(i, 2i)\, R(2i, 2i+1)\, R^*(2i+1, 2i)\, R^*(2i, 2i-1) \ge \frac{2}{81}\, K(i, 2i-1), \]

\[ R^2 (R^2)^*(i, 2i) \ge R(i, i+1)\, R(i+1, 2i+2)\, R^*(2i+2, 2i+1)\, R^*(2i+1, 2i) \ge \frac{2}{81}\, K(i, 2i) . \]

Here we are using the fact that $(R^2)^* = (R^*)^2$. □
Lemma 3.5. Let $K(i, 2i) = K(i, 2i - 1) = 1/2$ be a walk on the odd cycle $\mathbb{Z}_p$. Then,

\[ \lambda_K \ge \frac{1}{2\,\lceil \log_2 p \rceil^2} . \]

Proof. Suppose $x, y \in V$ and let $n = \lceil \log_2 p \rceil$. To construct a path from $x$ to $y$, let $x_0 = x$ and
consider all possible paths of length $n$, i.e. $x = x_0 \to x_1 \to \cdots \to x_n$ with $x_i = 2x_{i-1} - c_i$ and
$c_i \in \{0, 1\}$. Then

\[ x_n = 2^n x_0 - \sum_{i=1}^{n} 2^{n-i} c_i \bmod p . \qquad (3.1) \]

Each value in $\{0, 1, \ldots, 2^n - 1\}$ can be written in exactly one way as a sum $\sum_{i=1}^{n} 2^{n-i} c_i$, and so
there are either one or two possible paths from $x_0$ to $x_n = y$. Pick one as the canonical path $\gamma_{xy}$.

To apply Theorem 3.3 fix an edge $(a, b)$ with $K(a,b) > 0$ and suppose that $(a, b)$ is the $i$-th edge
in path $\gamma_{xy}$. Then $x \in \{z : K^{i-1}(z, a) > 0\}$ and $y \in \{z : K^{n-i}(b, z) > 0\}$, and so there are at most
$|\{z : K^{i-1}(z,a) > 0\}| \times |\{z : K^{n-i}(b,z) > 0\}| \le 2^{i-1} \times 2^{n-i} = 2^{n-1} < p$ such paths. There are
$n = \lceil \log_2 p \rceil$ possible values of $i$, so there are at most $p \times \lceil \log_2 p \rceil$ paths through this edge, each of
length $|\gamma_{xy}| = \lceil \log_2 p \rceil$. Theorem 3.3 completes the proof. □
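The counting in this proof can be checked by brute force for a small odd $p$. The sketch below enumerates every path $x_i = 2x_{i-1} - c_i \bmod p$ of length $n = \lceil \log_2 p \rceil$, records how many paths join each pair $(x, y)$, picks the first path found as the canonical one (an arbitrary choice made here for illustration), and reports the heaviest edge load counted with multiplicity. The function name is hypothetical.

```python
from itertools import product
from math import ceil, log2

def canonical_path_stats(p):
    """Brute-force check of the path counts used for the walk K on Z_p.

    Returns (min paths per pair, max paths per pair, max edge load), where
    the edge load counts uses of an edge by canonical paths with x != y.
    """
    n = ceil(log2(p))
    counts = [[0] * p for _ in range(p)]
    edge_load = {}
    for x in range(p):
        chosen = set()  # endpoints that already have a canonical path from x
        for bits in product((0, 1), repeat=n):
            path, cur = [], x
            for c in bits:
                nxt = (2 * cur - c) % p
                path.append((cur, nxt))
                cur = nxt
            counts[x][cur] += 1
            if cur != x and cur not in chosen:
                chosen.add(cur)
                for edge in path:
                    edge_load[edge] = edge_load.get(edge, 0) + 1
    flat = [v for row in counts for v in row]
    return min(flat), max(flat), max(edge_load.values())
```

For $p = 13$ (so $n = 4$) every pair is joined by one or two paths, and no edge carries more than $p \lceil \log_2 p \rceil = 52$ canonical path uses, in line with the bound used above.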

It follows from Theorem 3.2, Lemma 3.4 and Lemma 3.5 that $\tau_s(\epsilon) = O((\log p)^2 \log(p/\epsilon))$.

The preceding argument was based on the observation that studying $R^m$ for some $m > 1$ may
be easier than studying the walk $R$ directly, as was done in [9]. In Concluding Remarks we sketch
an argument for why this approach cannot be used to show better than $\tau_s(1/2) = O((\log p)^2/m)$
for the Pollard Rho walk.
4 Separation via Stopping Time



While canonical paths are much more widely used for bounding mixing time, the most direct route 
for bounding the separation distance is via a strong stationary time. 

Definition 4.1. A stopping time for a random walk $\{Y_i\}_{i=0}^{\infty}$ is a random variable $T \in \mathbb{N}$ such that
the event $\{T \le t\}$ depends only on $Y_0, Y_1, \ldots, Y_t$. A stopping time $T$ is a strong stationary time
with stationary distribution $\pi$, if

\[ \forall y,\quad \Pr[Y_t = y \mid T \le t] = \pi(y) . \]

The key point here is that

\[ \Pr[Y_t = y] \ge \Pr[T \le t]\, \Pr[Y_t = y \mid T \le t] = \pi(y)\, \Pr[T \le t] \]

and so

\[ \mathrm{sep}(P^t(x, \cdot), \pi) = \max_{y \in V} \left(1 - \frac{P^t(x,y)}{\pi(y)}\right) \le \Pr[T > t] . \]

We first consider the case $p = 2^m - 1$. The construction can be thought of as an extension of
the approach of Aldous and Diaconis [1]. In the following section a similar bound for general odd
$p$ will be shown.

Theorem 4.2. If $p = 2^m - 1$ then the Pollard Rho walk on the cycle $\mathbb{Z}_p$ has mixing time

\[ \tau_s(1/2) = O(\log p \log\log p) . \]

Proof. The key to the proof will be to reduce the problem to one of constructing a strong stationary
time for a walk with transitions $i \to 2i + 0$ and $i \to 2i + 1$, each with equal probability.

Let us refer to the three types of moves that the Pollard Rho random walk makes, namely
$(i, i + 1)$, $(i, i + k)$, and $(i, 2i)$, as moves of Type 1, Type 2, and Type 3, respectively. In general, let
the random walk be denoted by $Y_0, Y_1, Y_2, \ldots$, with $Y_t$ indicating the position of the walk (modulo
$p$) at time $t \ge 0$.

Define new random variables $T_i$, $b_i$ and $X_i$: Let $X_0 = 0$ and $T_0 = 0$. Let $T_1$ be the first time,
after time 0, that the walk makes a move of Type 3. Let $b_1 = Y_{T_1 - 1} - Y_{T_0}$ (i.e. the ground
covered, mod $p$, only using consecutive moves of Types 1 and 2). Let $X_1 = 2X_0 + b_1$. (Thus
$X_1 = Y_{T_1 - 1} - Y_{T_0}$.) More generally, let $T_i$ be the first time, since $T_{i-1}$, that a move of Type 3
happens. Let $b_i = Y_{T_i - 1} - Y_{T_{i-1}}$, and let $X_i = 2X_{i-1} + b_i = Y_{T_i - 1} - 2^{i-1} Y_{T_0}$. Observe that $T_i$, for each
$i$, is a valid stopping time.

Auxiliary Randomness: For the sake of the analysis, we generate the above random walk using an
auxiliary random process: at each time step $t > 0$, we generate an integer $R_t$ uniformly at random
from the set of integers [1..9]. The integers 1,2,3 are associated with (or interpreted as) a move of
Type 1, the integers 4,5,6 with Type 2, and finally the integers 7,8,9 with a move of Type 3.

Define History: To keep the independence of random variables transparent, it is best to associate
history vectors $H_i$ with the random walk as follows. The entries of the history vector are from
[1..9]; every time a doubling move happens, we stop the current history vector (after recording
the current $R_t$ value), and start growing a new vector. Thus $H_1 = (R_1, \ldots, R_{T_1})$, and in general,
$H_i = (R_{T_{i-1}+1}, \ldots, R_{T_i})$, and note that the history vector always ends in a 7, 8, or a 9, since those
are identified with a Type 3 move.

Special history vectors: We call certain history vectors of length one or two special: $H$ is
special if $H = (7)$ or $H = (a, b)$, where $a \in \{1, 2, 3\}$ and $b \in \{7, 8, 9\}$. Note that given the history
vectors and $Y_0$, all other (random) variables, $Y_i, b_i, T_i, X_i$, can (uniquely) be determined. Moreover,
if a history $H_i$ is special, then the corresponding $b_i$ equals 0 or 1, depending on
whether $H_i = (7)$ or $(a, b)$, respectively; in the latter case, $a$ being 1, 2, or 3 implies that a move of
Type 1 took place before the doubling, and hence the ground covered is simply +1.

This completes the set up. 
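The set up can be made concrete with a short simulation, offered only as an illustrative sketch: it drives the Rho walk from the auxiliary integers $R_t$, cuts the stream into history vectors at each doubling, and records $b_i$ and $X_i = 2X_{i-1} + b_i$. It also checks the identity $X_i = Y_{T_i - 1} - 2^{i-1} Y_{T_0}$ along the way. The function names and the choice of parameters in the usage are assumptions made here.

```python
import random

def simulate_blocks(p, k, y0, num_blocks, seed=0):
    """Run the Rho walk on Z_p from auxiliary integers R_t in 1..9.

    Returns (history vectors H_i, increments b_i, values X_i, identity_ok),
    where identity_ok says whether X_i = Y_{T_i - 1} - 2^(i-1) * Y_0 (mod p)
    held for every block.
    """
    rng = random.Random(seed)
    y, x_val = y0 % p, 0
    histories, bs, xs, identity_ok = [], [], [], True
    for i in range(1, num_blocks + 1):
        hist, start = [], y          # start = Y_{T_{i-1}}
        while True:
            r = rng.randint(1, 9)
            hist.append(r)
            if r <= 3:               # Type 1: i -> i + 1
                y = (y + 1) % p
            elif r <= 6:             # Type 2: i -> i + k
                y = (y + k) % p
            else:                    # Type 3: i -> 2i closes the history
                before_doubling = y  # this is Y_{T_i - 1}
                y = (2 * y) % p
                break
        b = (before_doubling - start) % p
        x_val = (2 * x_val + b) % p
        histories.append(tuple(hist))
        bs.append(b)
        xs.append(x_val)
        if (before_doubling - pow(2, i - 1, p) * y0) % p != x_val:
            identity_ok = False
    return histories, bs, xs, identity_ok

def is_special(hist):
    """True for H = (7) or H = (a, b) with a in {1,2,3} and b in {7,8,9}."""
    return hist == (7,) or (
        len(hist) == 2 and hist[0] in (1, 2, 3) and hist[1] in (7, 8, 9)
    )
```

Running `simulate_blocks(31, 7, 5, 200)` and filtering with `is_special` shows that special histories always give $b_i \in \{0, 1\}$, which is the property the analysis below relies on.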

The Actual Analysis: 

Let $s = rm$. (Recall that $m = \log_2(p + 1)$; we will choose later $r = c \log_2 m$, for $c > 0$ a suitable
constant.) Consider

\[ X_s = 2^{s-1} b_1 + 2^{s-2} b_2 + \cdots + 2^0 b_s , \]

which may be rewritten (using $2^m \equiv 1$ modulo $p$) as

\[ X_s = 2^{m-1}(b_1 + b_{m+1} + b_{2m+1} + \cdots + b_{(r-1)m+1}) + 2^{m-2}(b_2 + b_{m+2} + b_{2m+2} + \cdots + b_{(r-1)m+2}) + \cdots + 2^0 (b_m + b_{2m} + \cdots + b_{rm}) . \]

In other words, if we refer to each set of terms inside the parentheses as a Block then there are
$m$ Blocks, each associated with $2^i$ for $i = m - 1, m - 2, \ldots, 0$.

Define Auxiliary random variables using Special History: Recall that each history vector $H_i$
produces a $b_i$. Let $C_1 = b_{jm+1}$, where $j \in \{0, 1, \ldots, (r - 1)\}$ is the first (smallest) index in Block 1 such
that $b_{jm+1}$ comes from a special history. More generally, for $i = 1, 2, \ldots, m$, let $C_i = b_{jm+i}$,
where $j \in \{0, 1, \ldots, (r - 1)\}$ is the first (smallest) $j$ such that $b_{jm+i}$ comes from a special history. If
no such $j$ is present (which is possible, since there need not be any occurrences of special history
in the corresponding interval), then denote such a $C_i$ to be $\infty$ (or undefined).

By the remarks above, each $C_i$ (once defined) is 0 or 1, and moreover each $C_i$ is an independent
(of all the other $C_j$'s) Bernoulli trial, since the corresponding $b_j$'s are mutually independent. Then
we may rewrite $X_s$ as follows:

\[ X_s = 2^{m-1} C_1 + 2^{m-2} C_2 + \cdots + 2^0 C_m + 2^{m-1}(\mathrm{Rest}_1) + 2^{m-2}(\mathrm{Rest}_2) + \cdots + 2^0 (\mathrm{Rest}_m) , \]

where $\mathrm{Rest}_i$ is the sum (over $j$) of $b_{jm+i}$ minus the special $b$ that became $C_i$.

The Basic Dyadic Randomness argument from [1]: What is relevant or important here is that if
all $C_i$ are defined then

\[ X_s = 2^{m-1} C_1 + 2^{m-2} C_2 + \cdots + 2^0 C_m + \mathrm{REST} =: S_m + \mathrm{REST} , \]

where, as we will see shortly, the first part ($S_m$) randomizes $X_s$ so that the REST will not matter;
more formally, if $S_m \ne \infty$ then

\[ \Pr[X_s = w] = \sum_{R} \Pr[\mathrm{REST} = R]\, \Pr[S_m + R = w \mid \mathrm{REST} = R] , \]

and $\Pr[S_m = w - R \mid \mathrm{REST} = R] = 1/2^m$ except that if $w - R = 0$ then $\Pr[S_m = 0 \mid \mathrm{REST} = R] =
2/2^m$ (when all $C_i$'s are 0 or all 1). This holds even if we condition on $T_s$ = constant as well.

Consider the stopping rule $T$ such that $T = T_{rm}$ for the first $r$ for which all $C_i$ are well-defined,
except that if all $C_i$ are 1 then set all $C_i$ to undefined and begin the process again. The above
shows that this is a strong stationary time for the Rho walk.

It remains to bound the time until all $C_i$ are well-defined, which (by coupon-collector intuition)
should be roughly $m \log m$: Observe that for a fixed $i$,

\[ \Pr[C_i = \infty] = (1 - 2/9)^r , \qquad (4.2) \]

since Pr[appropriate history $H_i$ is special] $= 1/9 + 1/9$, and each of the $r$ (independent!)
possibilities in the $i$th Block ought to have been unsuccessful.

So as long as $r \ge (1 + \delta)(\log m)/\log(9/7)$, the probability in (4.2) is at most $1/m^{1+\delta}$. Hence,
$\Pr[\text{all } C_i \in \{0, 1\}] \ge 1 - m/m^{1+\delta} = 1 - m^{-\delta}$, and so $\Pr[\text{all } C_i \in \{0, 1\} \cap \text{not all } C_i = 1] \ge
(1 - 1/2^m)(1 - 1/m^{\delta})$. For $s = rm$ with $r = \lceil 3(\log m)/\log(9/7) \rceil$, we have

\[ \Pr[T \le T_s] \ge (1 - 1/2^m)(1 - 1/m^2) . \]

A Type 3 move occurs on average every 3 steps and so by Markov's inequality, since $E[T_s] = 3s$,
we have $\Pr[T_s > 9s] \le 1/3$. Hence, if $k = 9m \left\lceil \frac{3 \log m}{\log(9/7)} \right\rceil$ then

\[ \Pr[T \le k] \ge \Pr[T \le T_s] - \Pr[T_s > k] \ge (1 - 1/2^m)(1 - 1/m^2) - 1/3 \ge 1/2 . \]

□ 
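The constants at the end of this proof are easy to evaluate for a given $m$. The helper below is a small numerical sketch (its name is hypothetical); it computes $r = \lceil 3 (\log m)/\log(9/7) \rceil$, the step count $9mr$, and the lower bound $(1 - 1/2^m)(1 - 1/m^2) - 1/3$ on the probability that the strong stationary time has occurred.

```python
import math

def strong_stationary_bound(m):
    """Constants from the proof of Theorem 4.2 for p = 2^m - 1.

    Returns (r, steps, lower) where steps = 9 * m * r and lower is the
    bound (1 - 2^-m) * (1 - m^-2) - 1/3 on Pr[T <= steps].
    """
    r = math.ceil(3 * math.log(m) / math.log(9 / 7))
    steps = 9 * m * r
    lower = (1 - 2.0 ** (-m)) * (1 - 1.0 / m ** 2) - 1.0 / 3
    return r, steps, lower
```

For $m = 5$ (that is, $p = 31$) this gives $r = 20$ and $900$ steps, with a lower bound of about $0.597$. The explicit constants are generous; the point of the theorem is the $m \log m$ growth.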

In Concluding Remarks it will be shown that the mixing bound $\tau_s(1/2) = O(\log p \log\log p)$
found here for $X_s$ is of the correct order.



5 Fourier Analysis 

We now turn to the general case of p odd, where we work with Fourier analysis. The construction 
can be thought of as an application of the ideas of Chung, Diaconis and Graham [2]. 

In the previous two sections a bound on mixing time of the Rho walk was used to derive a
bound on collision time. This time we consider a "block walk" in which a single step corresponds
to a Rho walk truncated on an $i \to 2i$ step, i.e. a walk $Z_i$ where $Z_i = 2(Z_{i-1} + b_i)$ with $b_i$ defined
as in Section 4. Note that in $t$ steps of the Rho walk the expected number of block steps is $t/3$,
and Chebyshev's Inequality shows that $\mathrm{Prob}(\#\text{block steps} < t/4) \le 32/t$. Hence, if $\tau_s^{B}(1/2)$ denotes
the mixing time of the block walk, then in $t = 4\left(1 + \tau_s^{B}(1/2) + 2\sqrt{2c\,|G|\,\tau_s^{B}(1/2)}\right)$ steps of the Rho
walk a collision occurs with probability at least $1 - e^{-c} - \mathrm{Prob}(\#\text{block steps} < t/4) \ge 1 - e^{-c} - 32/t$.

To bound mixing time of the block walk it suffices to show that for large enough $s$ the distribution
$\nu_s$ of

\[ X_s = 2^{s-1} b_1 + \cdots + b_s \]

is close to the uniform distribution $u$. More precisely, we will show that

\[ p \sum_{j=0}^{p-1} (\nu_s(j) - u(j))^2 \le 2\left( (1 + \xi^{2\lfloor s/m \rfloor})^{m-1} - 1 \right) , \qquad (5.3) \]

where $\nu_s(j) = \Pr[X_s = j]$, $\xi = 1 - \frac{4 - \sqrt{10}}{9}$, and $m$ satisfies $2^{m-1} < p < 2^m$. This suffices to bound
the separation distance, as shown in Remark 5.2 at the end of the section.
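To see how quickly the right-hand side of (5.3) decays, it can be evaluated directly. The sketch below uses the constant $\xi = 1 - (4 - \sqrt{10})/9 \approx 0.907$ and searches for the smallest multiple of $m$ at which the bound drops below a target; the function names are hypothetical and the search over multiples of $m$ is a simplification made here.

```python
import math

XI = 1 - (4 - math.sqrt(10)) / 9  # the constant xi appearing in (5.3)

def chi_square_bound(s, m):
    """Right-hand side of (5.3): 2 * ((1 + xi^(2*floor(s/m)))^(m-1) - 1)."""
    return 2 * ((1 + XI ** (2 * (s // m))) ** (m - 1) - 1)

def blocks_needed(m, eps):
    """Smallest multiple s of m for which the bound in (5.3) is at most eps."""
    r = 1
    while chi_square_bound(r * m, m) > eps:
        r += 1
    return r * m
```

Since $(1 + a)^{m-1} - 1$ is roughly $(m-1)a$ for small $a$, the bound becomes small once $\xi^{2\lfloor s/m \rfloor}$ is of order $1/m$, that is once $s$ is of order $m \log m$.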

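The proof uses the standard Fourier transform and the Plancherel identity: For any
complex-valued function $f$ on $\mathbb{Z}_p$ and $\omega = e^{2\pi i/p}$, recall that the Fourier transform $\hat{f} : \mathbb{Z}_p \to \mathbb{C}$ is given by
$\hat{f}(\ell) = \sum_{j=0}^{p-1} \omega^{\ell j} f(j)$, and the Plancherel identity asserts that

\[ p \sum_{j=0}^{p-1} |f(j)|^2 = \sum_{\ell=0}^{p-1} |\hat{f}(\ell)|^2 . \]

For the distribution $\mu$ of a $\mathbb{Z}_p$-valued random variable $X$, its Fourier transform is

\[ \hat{\mu}(\ell) = \sum_{j=0}^{p-1} \omega^{\ell j} \mu(j) = E[\omega^{\ell X}] . \]

Thus, for the distributions $\mu_1, \mu_2$ of two independent random variables $Y_1, Y_2$, the distribution $\nu$ of
$X := Y_1 + Y_2$ has the Fourier transform $\hat{\nu} = \hat{\mu}_1 \hat{\mu}_2$, since

\[ \hat{\nu}(\ell) = E[\omega^{\ell (Y_1 + Y_2)}] = E[\omega^{\ell Y_1}]\, E[\omega^{\ell Y_2}] = \hat{\mu}_1(\ell)\, \hat{\mu}_2(\ell) . \]

Generally, the distribution $\nu$ of $X := Y_1 + \cdots + Y_s$ with independent $Y_i$'s has the Fourier transform
$\hat{\nu} = \prod_{j=1}^{s} \hat{\mu}_j$. Moreover, for the uniform distribution $u$, it is easy to check that

\[ \hat{u}(\ell) = \begin{cases} 1 & \text{if } \ell = 0, \\ 0 & \text{otherwise.} \end{cases} \]

As the random variables $2^j b_{s-j}$ are independent, $\hat{\nu}_s = \prod_{j=0}^{s-1} \hat{\mu}_j$, where $\mu_j$ are the distributions of
$2^j b_{s-j}$. The linearity of the Fourier transform and $\hat{\nu}_s(0) = E[1] = 1$ yield

\[ \widehat{\nu_s - u}(\ell) = \hat{\nu}_s(\ell) - \hat{u}(\ell) = \begin{cases} 0 & \text{if } \ell = 0, \\ \prod_{j=0}^{s-1} \hat{\mu}_j(\ell) & \text{otherwise.} \end{cases} \]

By Plancherel's identity, it is enough to show that
Lemma 5.1.

\[ \sum_{\ell=1}^{p-1} \left| \prod_{j=0}^{s-1} \hat{\mu}_j(\ell) \right|^2 \le 2\left( (1 + \xi^{2\lfloor s/m \rfloor})^{m-1} - 1 \right) . \]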

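Proof. Let $A_j$ be the event that $b_{s-j} = 0$ or 1. Then,

\[ \hat{\mu}_j(\ell) = E[\omega^{\ell 2^j b_{s-j}}] = \Pr[b_{s-j} = 0] + \Pr[b_{s-j} = 1]\, \omega^{\ell 2^j} + \Pr[\bar{A}_j]\, E[\omega^{\ell 2^j b_{s-j}} \mid \bar{A}_j] , \]

and, for $x := \Pr[b_{s-j} = 0]$ and $y := \Pr[b_{s-j} = 1]$,

\[ |\hat{\mu}_j(\ell)| \le |x + y\,\omega^{\ell 2^j}| + (1 - x - y)\, \left| E[\omega^{\ell 2^j b_{s-j}} \mid \bar{A}_j] \right| \le |x + y\,\omega^{\ell 2^j}| + 1 - x - y . \]

Notice that

\[ |x + y\,\omega^{\ell 2^j}|^2 = x^2 + y^2 + 2xy \cos\frac{2\pi \ell 2^j}{p} . \]

If $\cos\frac{2\pi \ell 2^j}{p} \le 0$, then

\[ |\hat{\mu}_j(\ell)| \le (x^2 + y^2)^{1/2} + 1 - x - y = 1 - \left(x + y - (x^2 + y^2)^{1/2}\right) . \]

Since $x = \Pr[b_{s-j} = 0] \ge 1/3$ and $y = \Pr[b_{s-j} = 1] \ge 1/9$, it is easy to see that $x + y - (x^2 + y^2)^{1/2}$
has its minimum when $x = 1/3$ and $y = 1/9$. (For both partial derivatives are positive.) Hence,

\[ |\hat{\mu}_j(\ell)| \le \xi = 1 - \frac{4 - \sqrt{10}}{9} \quad \text{provided } \cos\frac{2\pi \ell 2^j}{p} \le 0 . \]

If $\cos\frac{2\pi \ell 2^j}{p} > 0$, we use the trivial bound $|\hat{\mu}_j(\ell)| = |E[\omega^{\ell 2^j b_{s-j}}]| \le 1$.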

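For $\ell = 1, \ldots, p - 1$, let $\phi_s(\ell)$ be the number of $j = 0, \ldots, s - 1$ such that $\cos\frac{2\pi \ell 2^j}{p} \le 0$. Then

\[ \prod_{j=0}^{s-1} |\hat{\mu}_j(\ell)| \le \xi^{\phi_s(\ell)} . \qquad (5.4) \]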



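To estimate $\phi_s(\ell)$, we consider the binary expansion of

\[ \ell/p = .\alpha_{\ell,1}\, \alpha_{\ell,2} \cdots \alpha_{\ell,s} \cdots , \]

$\alpha_{\ell,j} \in \{0, 1\}$ with $\alpha_{\ell,j} = 0$ infinitely often. Hence, $\ell/p = \sum_{j \ge 1} 2^{-j} \alpha_{\ell,j}$. The fractional part of $\ell 2^j/p$
may be written

\[ \{\ell 2^j/p\} = .\alpha_{\ell,j+1}\, \alpha_{\ell,j+2} \cdots \alpha_{\ell,s} \cdots . \]

Notice that $\cos\frac{2\pi \ell 2^j}{p} \le 0$ if the fractional part of $\ell 2^j/p$ is (inclusively) between 1/4 and 3/4, which
follows if $\alpha_{\ell,j+1} \ne \alpha_{\ell,j+2}$. Thus, $\phi_s(\ell)$ is at least as large as the number of alternations in the sequence
$(\alpha_{\ell,1}, \alpha_{\ell,2}, \ldots, \alpha_{\ell,s})$.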
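We now take $m$ such that $2^{m-1} < p < 2^m$. Observe that, for $\ell = 1, \ldots, p - 1$, the subsequences
$\alpha(\ell) := (\alpha_{\ell,1}, \alpha_{\ell,2}, \ldots, \alpha_{\ell,m})$ of length $m$ are pairwise distinct: If $\alpha(\ell) = \alpha(\ell')$ for some $\ell < \ell'$
then $\frac{\ell' - \ell}{p}$ is less than $\sum_{j \ge m+1} 2^{-j} \le 2^{-m}$, which is impossible as $p < 2^m$. Similarly, for fixed $j$
and $\ell = 1, \ldots, p - 1$, all subsequences $\alpha(\ell; j) := (\alpha_{\ell,j+1}, \ldots, \alpha_{\ell,j+m})$ are pairwise distinct. In
particular, for fixed $r$ with $r = 0, \ldots, \lfloor s/m \rfloor - 1$, all subsequences $\alpha(\ell; rm)$, $\ell = 1, \ldots, p - 1$, are
pairwise distinct. Since the fractional part $\{\ell 2^{rm}/p\} = .\alpha_{\ell,rm+1}\, \alpha_{\ell,rm+2} \cdots$ must be the same as $\ell'/p$
for some $\ell'$ in the range $1 \le \ell' \le p - 1$, there is a unique permutation $\sigma_r$ of $1, \ldots, p - 1$ such that
$\alpha(\ell; rm) = \alpha(\sigma_r(\ell))$. Writing $|\alpha(\sigma_r(\ell))|_A$ for the number of alternations in $\alpha(\sigma_r(\ell))$, we have

\[ \phi_s(\ell) \ge \sum_{r=0}^{\lfloor s/m \rfloor - 1} |\alpha(\sigma_r(\ell))|_A , \]

where $\sigma_0$ is the identity. Therefore, (5.4) gives

\[ \sum_{\ell=1}^{p-1} \left| \prod_{j=0}^{s-1} \hat{\mu}_j(\ell) \right|^2 \le \sum_{\ell=1}^{p-1} \xi^{\,2 \sum_{r=0}^{\lfloor s/m \rfloor - 1} |\alpha(\sigma_r(\ell))|_A} . \]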



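Using

\[ \xi^{x+y} + \xi^{x'+y'} \le \xi^{\min\{x,x'\} + \min\{y,y'\}} + \xi^{\max\{x,x'\} + \max\{y,y'\}} \]

inductively, the above upper bound may be maximized when all $\sigma_r$'s are the identity, i.e.,

\[ \sum_{\ell=1}^{p-1} \left| \prod_{j=0}^{s-1} \hat{\mu}_j(\ell) \right|^2 \le \sum_{\ell=1}^{p-1} \xi^{\,2 \lfloor s/m \rfloor\, |\alpha(\ell)|_A} . \]

Note that $1/p \le \ell/p \le 1 - 1/p$ implies that $\alpha(\ell)$ is neither $(0, \ldots, 0)$ nor $(1, \ldots, 1)$ (both are of length
$m$). This means that all $\alpha(\ell)$ have at least one alternation. Since the $\alpha(\ell)$'s are pairwise distinct,

\[ \sum_{\ell=1}^{p-1} \xi^{\,2 \lfloor s/m \rfloor\, |\alpha(\ell)|_A} \le \sum_{\alpha :\, |\alpha|_A > 0} \xi^{\,2 \lfloor s/m \rfloor\, |\alpha|_A} , \]

where the sum is taken over all sequences $\alpha \in \{0, 1\}^m$ with $|\alpha|_A > 0$.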
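Let $H(z)$ be the number of $\alpha$'s with exactly $z$ alternations. Then

\[ H(z) = 2 \binom{m-1}{z} \]

and hence

\[ \sum_{\alpha :\, |\alpha|_A > 0} \xi^{\,2 \lfloor s/m \rfloor\, |\alpha|_A} = \sum_{z=1}^{m-1} 2 \binom{m-1}{z} \xi^{\,2 \lfloor s/m \rfloor z} = 2\left( (1 + \xi^{2\lfloor s/m \rfloor})^{m-1} - 1 \right) . \]

□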



Remark 5.2. To show a bound on the separation distance, we use Cauchy-Schwarz:

\[ \left| \frac{P^{2s}(x,y) - \pi(y)}{\pi(y)} \right| = \left| \frac{\sum_z (P^s(x,z) - \pi(z))(P^s(z,y) - \pi(y))}{\pi(y)} \right| = \left| \sum_z \pi(z) \left( \frac{P^s(x,z)}{\pi(z)} - 1 \right) \left( \frac{P^{*s}(y,z)}{\pi(z)} - 1 \right) \right| \]

\[ \le \sqrt{\sum_z \pi(z) \left| \frac{P^s(x,z)}{\pi(z)} - 1 \right|^2}\ \sqrt{\sum_w \pi(w) \left| \frac{P^{*s}(y,w)}{\pi(w)} - 1 \right|^2} . \]

For the "block walk" the first sum after the inequality is equal to the quantity upper bounded in
equation (5.3), while the second is the same quantity but for the time-reversed walk $P^*(a,b) =
\pi(b) P(b, a)/\pi(a)$. To bound the mixing time of the reversed walk let $b_i^*$ denote the sum of steps
taken by $R^*$ between the $(i - 1)$-st and $i$th time that $j \to j/2$ is chosen (i.e. step size taken by
time-reversed block walk), let $X_s^* = 2^{-s+1} b_1^* + \cdots + b_s^*$ and let $b_i = -b_i^*$. Then

\[ \Pr[-2^{s-1} X_s^* = j] = \Pr[b_1 + 2 b_2 + \cdots + 2^{s-1} b_s = j] = \Pr[X_s = j] \]

because the $b_i$ are independent random variables from the same distribution as the blocks of $R$. It
follows from (5.3) that

\[ 1 - \frac{\Pr[X_{2s} = y]}{u(y)} \le 2\left( (1 + \xi^{2\lfloor s/m \rfloor})^{m-1} - 1 \right) , \]



and so after 2s ~ m log^ ^^i~T — 2m log ^^'"^ blocks the separation distance drops to e. 

Remark 5.3. In recent work with Yuval Peres to appear in the journal version of this paper, we
build on techniques from this paper and an idea from [?] and manage to improve the collision time
bound to the conjectured value of $\Theta(\sqrt{p})$. The argument is based on showing that in $O(\sqrt{p})$ steps the
number of collisions $S$ in the $X_s$ walk satisfies $E(S)^2 = \Omega(E(S^2))$, and so with constant probability
there is a collision. This was in turn shown by re-writing $E(S^2)$ in terms of a quantity closely
related to the Plancherel identity appearing in our Fourier proof of mixing time.

6 Concluding Remarks 

We sketch here some reasoning for why many common methods for bounding mixing times will not 
be useful in showing the optimal mixing bound in separation distance for the Pollard Rho walk. 
A coupling argument bounds only the weaker total-variation distance, i.e. shows only that

\[ \max_{A \subset G} \left( \pi(A) - \Pr[X_t \in A] \right) \le \epsilon . \]

To bound $\tau_s(1/2)$ with this requires $\epsilon \le 1/2p$, which typically increases the mixing bound by a
multiplicative factor of $\log(1/\epsilon)$. Total variation mixing time $\tau(1/2)$ is trivially at least $\log_3 p - 1$,
and so this gives a separation bound of at best $O(\log^2 p)$. Alternatively, re-working the collision
time argument in terms of variation distance results in a $\sqrt{\log p}$ loss.

When working with spectral gap, spectral profile, log-Sobolev and Nash inequalities a weakness
arises in that mixing bounds in terms of these quantities are based on studying the rate of decay of
variance. As such these do a poor job of distinguishing mixing time of a non-reversible walk $P$ from
that of its additive reversibilization $P' = \frac{P + P^*}{2}$, or lazy additive reversibilization $P'' = \frac{I}{2} + \frac{P + P^*}{4}$. The
lazy additive reversibilization $R'' = \frac{I}{2} + \frac{R + R^*}{4}$ of the Pollard Rho walk mixes in time $\tau_{s,R''}(1/2) =
\Omega(\log^2 p)$ (see below), and so we expect that the aforementioned methods for bounding mixing time
of $R$ will do no better than this.

More precisely, the mixing time bounds involving these quantities can be shown by using the
relation $\mathrm{Var}_\pi(k_{t+1}^x) - \mathrm{Var}_\pi(k_t^x) = -\mathcal{E}_{PP^*}(k_t^x, k_t^x)$ for the $t$-step density $k_t^x$ of a walk started at state $x$.
By a comparison argument it can be shown that if k = p — 1 and m> 2 then iS^R/z-j^R//-)* (/, /) > 
i5RmR*m(/, /). Heuce these Dirichlet form based methods will show a mixing time bound on R*" 
which is no better than times faster than the corresponding upper bound on the mixing time 
of R". To bound the mixing time of R" let n = ^ [log2pJ, x = 1/2" G Zp and T = ([log2pJ)^. 
It can then be shown that {R")^{x, S) > 7/9 where 




2i2"+3\/i72-i 

5= u u 



1+j 
2n-i 



But tt{S) < i (6^172 + l)(r2"+3v^/2 + 1) < 1/8 and so for some y S we must have 



(RT(^ . (RTM!)^./in 
Ay) - AS^) ^ 
It follows that $\tau_{s,R''}(1/2) \ge T = \Omega(\log^2 p)$. The corresponding mixing bound on $R^m$ can then lead
to a bound of at best

\[ \tau_{s,R}(1/2) \le m\, \tau_{s,R^m}(1/2) = O((\log p)^2/m) . \]

Hence if $m \ll \frac{\log p}{\log\log p}$ then none of these methods will match our $O(\log p \log\log p)$ mixing bound.

It thus appears that to show a better than $O(\log^2 p)$ mixing time bound it will be necessary
to use a more specialized method, such as a more refined operator technique or computation
involving a high power of the transition probability matrix. Two methods involving high powers
were considered in this paper, a strong stationary time and a Fourier analysis argument.

We now turn to limitations of the block approach of working with $X_s$ which was taken in our
strong stationary time and Fourier analysis arguments. As with the random walk considered by
[1, 2], we might expect that the correct order of the mixing time of the $X_s$ walk considered in this
paper is indeed $\Theta(\log p \log\log p)$, at least for $p$ of the form $2^m - 1$ and $k = p - 1$. This is in fact
the case by an argument fairly similar to that of Section 4 "A proof of Case 2" of [5], which in
turn closely follows a proof of [2]. The basic idea is by now fairly standard: choose a function and
show that its expectation under the stationary distribution and under the $n$-step distribution $P^n$
are far apart, with sufficiently small variance to conclude that the two distributions ($P^n$ and $\pi$)
must differ significantly.

In keeping with notation of [5], suppose $p = 2^t - 1$ and let $k$ denote a variable over $\mathbb{Z}$ (no longer
the exponent $y = x^k$). The "separating function" of interest $f : \mathbb{Z}_p \to \mathbb{C}$ in this case is

\[ f(k) := \sum_{j=0}^{t-1} q^{k 2^j} \quad \text{where } q = e^{2\pi i/p} . \]

Then $E_U(f) = 0$ if $p > 1$, and $E_U(f \bar{f}) = t$ and so $\mathrm{Var}_\pi(f) = t$ where $\pi = U$ denotes the uniform
distribution. As in [5], ii n = rt and Pn denotes the distribution after n steps of the block walk 
(i.e. lo + Xn) (the analysis uses r = Jlogt — d G N for some fixed 6), then Ep^{f) = 111^ and 
EpMf) = t where H, = Pt{2^ - 1), and so Varp„(/) = t ET=o^ " ^'l^iP^ 

In order to bound variance and expectation we must approximate the $\Pi_j$. To do this, recall
that $X_i = 2X_{i-1} + b_i$; a generic such increment will be denoted by $b$, since the $b_i$ are independent
random variables from the same distribution. Let $a_k = \Pr[b = k] = \Pr[b = -k]$. This satisfies the
recurrence relation

\[ a_k = \frac{1}{3}(a_{k-1} + a_{k+1}), \qquad a_0 = \frac{1}{3} + \frac{2}{3} a_1, \qquad a_\infty = 0 . \]

■ It will also be useful to introduce a bit of 
notation. If < j < t — 1 then define /^^(x) = Pr[6 = x2^"] = a^2-'^ 



a=0 \ 1" / 



where 



1- 



1+ (^^)^ - (3-^/5) cos(2^x) 



The remainder of the argument differs little from that of [5]. There is a small mistake in the
proof of Claim 1 in [5], but it does not affect the proof for the Rho walk.